AITopics | shap zero

SHAP zero Explains Biological Sequence Models with Near-zero Marginal Cost for Future Queries

Neural Information Processing SystemsJun-18-2026, 17:50:25 GMT

The growing adoption of machine learning models for biological sequences has intensified the need for interpretable predictions, with Shapley values emerging as a theoretically grounded standard for model explanation. While effective for local explanations of individual input sequences, scaling Shapley-based interpretability to extract global biological insights requires evaluating thousands of sequences--incurring exponential computational cost per query. We introduce SHAP zero, a novel algorithm that amortizes the cost of Shapley value computation across large-scale biological datasets. After a one-time model sketching step, SHAP zero enables near-zero marginal cost for future queries by uncovering an underexplored connection between Shapley values, high-order feature interactions, and the sparse Fourier transform of the model. Applied to models of guide RNA efficacy, DNA repair outcomes, and protein fitness, SHAP zero explains predictions orders of magnitude faster than existing methods, recovering rich combinatorial interactions previously inaccessible at scale. This work opens the door to principled, efficient, and scalable interpretability for black-box sequence models in biology.

artificial intelligence, machine learning, shap zero, (20 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
(2 more...)

Add feedback

SHAP zero Explains Biological Sequence Models with Near-zero Marginal Cost for Future Queries

Neural Information Processing SystemsJun-12-2026, 21:37:26 GMT

The growing adoption of machine learning models for biological sequences has intensified the need for interpretable predictions, with Shapley values emerging as a theoretically grounded standard for model explanation. While effective for local explanations of individual input sequences, scaling Shapley-based interpretability to extract global biological insights requires evaluating thousands of sequences--incurring exponential computational cost per query. We introduce SHAP zero, a novel algorithm that amortizes the cost of Shapley value computation across large-scale biological datasets. After a one-time model sketching step, SHAP zero enables near-zero marginal cost for future queries by uncovering an underexplored connection between Shapley values, high-order feature interactions, and the sparse Fourier transform of the model. Applied to models of guide RNA efficacy, DNA repair outcomes, and protein fitness, SHAP zero explains predictions orders of magnitude faster than existing methods, recovering rich combinatorial interactions previously inaccessible at scale. This work opens the door to principled, efficient, and scalable interpretability for black-box sequence models in biology.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)

Add feedback

SHAP zero Explains Genomic Models with Near-zero Marginal Cost for Future Queried Sequences

Tsui, Darin, Musharaf, Aryan, Erginbas, Yigit Efe, Kang, Justin Singh, Aghazadeh, Amirali

arXiv.org Artificial IntelligenceDec-20-2024

With the rapid growth of large-scale machine learning models in genomics, Shapley values have emerged as a popular method for model explanations due to their theoretical guarantees. While Shapley values explain model predictions locally for an individual input query sequence, extracting biological knowledge requires global explanation across thousands of input sequences. This demands exponential model evaluations per sequence, resulting in significant computational cost and carbon footprint. Herein, we develop SHAP zero, a method that estimates Shapley values and interactions with a near-zero marginal cost for future queried sequences after paying a one-time fee for model sketching. SHAP zero achieves this by establishing a surprisingly underexplored connection between the Shapley values and interactions and the Fourier transform of the model. Explaining two genomic models, one trained to predict guide RNA binding and the other to predict DNA repair outcome, we demonstrate that SHAP zero achieves orders of magnitude reduction in amortized computational cost compared to state-of-the-art algorithms, revealing almost all predictive motifs -- a finding previously inaccessible due to the combinatorial space of possible interactions.

artificial intelligence, machine learning, shap zero, (20 more...)

arXiv.org Artificial Intelligence

2410.19236

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Filters

Collaborating Authors

shap zero

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

SHAP zero Explains Biological Sequence Models with Near-zero Marginal Cost for Future Queries

SHAP zero Explains Biological Sequence Models with Near-zero Marginal Cost for Future Queries

SHAP zero Explains Genomic Models with Near-zero Marginal Cost for Future Queried Sequences